Overview

Dataset statistics

Number of variables: 14
Number of observations: 786600
Missing cells: 24767
Missing cells (%): 0.2%
Duplicate rows: 546
Duplicate rows (%): 0.1%
Total size in memory: 90.0 MiB
Average record size in memory: 120.0 B
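
The derived figures in this table can be re-checked from the raw counts. The sketch below is only an arithmetic check, not part of the report itself; it assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 MiB is 1024 × 1024 bytes.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 14
n_observations = 786_600
missing_cells = 24_767
duplicate_rows = 546
total_size_mib = 90.0

# Assumption: percentages are relative to all cells / all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")              # 0.2%
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")           # 0.1%
print(f"Average record size: {avg_record_bytes:.1f} B")      # 120.0 B
```

Under these assumptions the rounded values agree with the table.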

Variable types

NUM: 10
CAT: 2
BOOL: 2

Warnings

Dataset has 546 (0.1%) duplicate rows [Duplicates]
customer_id has a high cardinality: 245455 distinct values [High cardinality]
order_date has a high cardinality: 776 distinct values [High cardinality]
customer_order_rank has 24767 (3.1%) missing values [Missing]
voucher_amount is highly skewed (γ1 = 30.39394065) [Skewed]
platform_id is highly skewed (γ1 = -22.53663783) [Skewed]
voucher_amount has 743462 (94.5%) zeros [Zeros]
delivery_fee has 597536 (76.0%) zeros [Zeros]
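
The "Skewed" and "Zeros" warnings go together for a column like voucher_amount: a column that is almost entirely zero with a few large values has a long right tail and therefore a large positive γ1. The following is a minimal sketch of the population moment coefficient of skewness, g1 = m3 / m2^1.5. The report's exact estimator may differ (for example by applying a small-sample bias correction), and the toy column is hypothetical, chosen only to mimic the shape of such data.

```python
def skewness(values):
    """Population moment coefficient of skewness: g1 = m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


def zero_fraction(values):
    """Share of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Hypothetical toy column: 95% zeros plus a few positive values.
toy = [0.0] * 95 + [1.0, 1.0, 2.0, 2.0, 50.0]

print(zero_fraction(toy))           # 0.95
print(skewness(toy) > 0)            # True: long right tail
print(skewness([1, 2, 3, 4, 5]))    # 0.0 for a symmetric sample
```

A strongly negative γ1, as reported for platform_id, is the mirror case: a few values far below the bulk of the data.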

Reproduction

Analysis started: 2020-10-10 11:29:18.310014
Analysis finished: 2020-10-10 11:32:38.396230
Duration: 3 minutes and 20.09 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

customer_id
Categorical

HIGH CARDINALITY

Distinct: 245455
Distinct (%): 31.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 MiB

Common values

Value                  Count   Frequency (%)
15edce943edd           386     < 0.1%
8745a335e9cf           288     < 0.1%
d956116d863d           286     < 0.1%
0063666607bb           273     < 0.1%
ae60dce05485           270     < 0.1%
a54a8e1579d4           254     < 0.1%
bebb751d49b8           253     < 0.1%
26ed6389a3aa           245     < 0.1%
ef6265f74aca           229     < 0.1%
a333fb175a0c           221     < 0.1%
Other values (245445)  783895  99.7%

[Figure: Frequencies of value counts]

Unique

Unique: 145498
Unique (%): 18.5%

[Figure: Histogram of lengths of the category]

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

order_date
Categorical

HIGH CARDINALITY

Distinct: 776
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 MiB

Common values

Value                  Count   Frequency (%)
2017-01-01             4230    0.5%
2016-12-18             3395    0.4%
2017-02-26             3234    0.4%
2017-02-05             3218    0.4%
2017-02-12             3125    0.4%
2016-12-11             3100    0.4%
2016-12-04             3075    0.4%
2017-01-22             3005    0.4%
2017-01-29             3003    0.4%
2016-10-03             2999    0.4%
Other values (766)     754216  95.9%

[Figure: Frequencies of value counts]

Unique

Unique: 41
Unique (%): < 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

order_hour
Real number (ℝ≥0)

Distinct: 24
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.58879608
Minimum: 0
Maximum: 23
Zeros: 4627
Zeros (%): 0.6%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 12
Q1: 16
Median: 18
Q3: 20
95-th percentile: 22
Maximum: 23
Range: 23
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.357192477
Coefficient of variation (CV): 0.1908710785
Kurtosis: 5.749711941
Mean: 17.58879608
Median Absolute Deviation (MAD): 2
Skewness: -1.749088644
Sum: 13835347
Variance: 11.27074133
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=24)]

Common values

Value                  Count   Frequency (%)
19                     134030  17.0%
18                     129654  16.5%
20                     108739  13.8%
17                     90782   11.5%
21                     68223   8.7%
16                     48877   6.2%
15                     34286   4.4%
22                     33403   4.2%
13                     31105   4.0%
14                     30323   3.9%
Other values (14)      77178   9.8%

Minimum 5 values

Value                  Count   Frequency (%)
0                      4627    0.6%
1                      2425    0.3%
2                      1187    0.2%
3                      443     0.1%
4                      137     < 0.1%

Maximum 5 values

Value                  Count   Frequency (%)
23                     13832   1.8%
22                     33403   4.2%
21                     68223   8.7%
20                     108739  13.8%
19                     134030  17.0%

customer_order_rank
Real number (ℝ≥0)

MISSING

Distinct: 369
Distinct (%): < 0.1%
Missing: 24767
Missing (%): 3.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.436809642
Minimum: 1
Maximum: 369
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 3
Q3: 10
95-th percentile: 39
Maximum: 369
Range: 368
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 17.77232218
Coefficient of variation (CV): 1.88329773
Kurtosis: 49.04720204
Mean: 9.436809642
Median Absolute Deviation (MAD): 2
Skewness: 5.494014541
Sum: 7189273
Variance: 315.8554356
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                  Count   Frequency (%)
1                      244937  31.1%
2                      96641   12.3%
3                      60532   7.7%
4                      43681   5.6%
5                      34036   4.3%
6                      27603   3.5%
7                      23049   2.9%
8                      19696   2.5%
9                      17013   2.2%
10                     14889   1.9%
Other values (359)     179756  22.9%
(Missing)              24767   3.1%

Minimum 5 values

Value                  Count   Frequency (%)
1                      244937  31.1%
2                      96641   12.3%
3                      60532   7.7%
4                      43681   5.6%
5                      34036   4.3%

Maximum 5 values

Value                  Count   Frequency (%)
369                    1       < 0.1%
368                    1       < 0.1%
367                    1       < 0.1%
366                    1       < 0.1%
365                    1       < 0.1%

is_failed
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 MiB

Value                  Count   Frequency (%)
0                      761833  96.9%
1                      24767   3.1%

voucher_amount
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 911
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.09148909292
Minimum: 0
Maximum: 93.3989
Zeros: 743462
Zeros (%): 94.5%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0.686
Maximum: 93.3989
Range: 93.3989
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4795579176
Coefficient of variation (CV): 5.241694963
Kurtosis: 3886.352852
Mean: 0.09148909292
Median Absolute Deviation (MAD): 0
Skewness: 30.39394065
Sum: 71965.32049
Variance: 0.2299757963
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                  Count   Frequency (%)
0                      743462  94.5%
1.029                  11647   1.5%
1.715                  11134   1.4%
2.058                  9122    1.2%
0.686                  3648    0.5%
1.372                  1770    0.2%
2.744                  1192    0.2%
2.5725                 897     0.1%
3.43                   543     0.1%
0.5145                 373     < 0.1%
Other values (901)     2812    0.4%

Minimum 5 values

Value                  Count   Frequency (%)
0                      743462  94.5%
0.00343                35      < 0.1%
0.28469                1       < 0.1%
0.32242                1       < 0.1%
0.343                  19      < 0.1%

Maximum 5 values

Value                  Count   Frequency (%)
93.3989                1       < 0.1%
78.02907               1       < 0.1%
68.3942                1       < 0.1%
61.82575               1       < 0.1%
37.57565               1       < 0.1%

delivery_fee
Real number (ℝ≥0)

ZEROS

Distinct: 98
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1811799318
Minimum: 0
Maximum: 9.86
Zeros: 597536
Zeros (%): 76.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0.986
Maximum: 9.86
Range: 9.86
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.3697095668
Coefficient of variation (CV): 2.040565769
Kurtosis: 8.481347092
Mean: 0.1811799318
Median Absolute Deviation (MAD): 0
Skewness: 2.417459196
Sum: 142516.1343
Variance: 0.1366851638
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                  Count   Frequency (%)
0                      597536  76.0%
0.493                  70617   9.0%
0.986                  35735   4.5%
0.7395                 34790   4.4%
0.2465                 7664    1.0%
1.2325                 7164    0.9%
1.479                  6768    0.9%
1.4297                 5078    0.6%
0.46835                3097    0.4%
0.4437                 2657    0.3%
Other values (88)      15494   2.0%

Minimum 5 values

Value                  Count   Frequency (%)
0                      597536  76.0%
0.02465                10      < 0.1%
0.0493                 3       < 0.1%
0.0986                 4       < 0.1%
0.1479                 303     < 0.1%

Maximum 5 values

Value                  Count   Frequency (%)
9.86                   1       < 0.1%
7.395                  1       < 0.1%
6.6555                 1       < 0.1%
6.409                  1       < 0.1%
5.916                  1       < 0.1%

amount_paid
Real number (ℝ≥0)

Distinct: 6471
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.18327131
Minimum: 0
Maximum: 1131.03
Zeros: 872
Zeros (%): 0.1%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 4.5135
Q1: 6.64812
Median: 9.027
Q3: 12.213
95-th percentile: 19.5408
Maximum: 1131.03
Range: 1131.03
Interquartile range (IQR): 5.56488

Descriptive statistics

Standard deviation: 5.6181212
Coefficient of variation (CV): 0.5517010233
Kurtosis: 2243.912588
Mean: 10.18327131
Median Absolute Deviation (MAD): 2.655
Skewness: 15.5881411
Sum: 8010161.21
Variance: 31.56328582
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                  Count   Frequency (%)
5.31                   14667   1.9%
7.965                  14410   1.8%
6.372                  11878   1.5%
8.496                  10350   1.3%
6.903                  9988    1.3%
5.841                  9734    1.2%
9.027                  9213    1.2%
7.434                  9156    1.2%
10.62                  8982    1.1%
9.558                  8377    1.1%
Other values (6461)    679845  86.4%

Minimum 5 values

Value                  Count   Frequency (%)
0                      872     0.1%
0.00531                1       < 0.1%
0.01593                1       < 0.1%
0.02655                1       < 0.1%
0.03717                1       < 0.1%

Maximum 5 values

Value                  Count   Frequency (%)
1131.03                1       < 0.1%
581.7105               1       < 0.1%
363.01815              1       < 0.1%
353.3805               1       < 0.1%
246.88845              1       < 0.1%

restaurant_id
Real number (ℝ≥0)

Distinct: 13569
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 162864079.3
Minimum: 73498
Maximum: 340453498
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 73498
5-th percentile: 29803498
Q1: 86023498
Median: 169613498
Q3: 228433498
95-th percentile: 302393498
Maximum: 340453498
Range: 340380000
Interquartile range (IQR): 142410000

Descriptive statistics

Standard deviation: 87830821.23
Coefficient of variation (CV): 0.5392890906
Kurtosis: -1.08595334
Mean: 162864079.3
Median Absolute Deviation (MAD): 71240000
Skewness: -0.02254910338
Sum: 1.281088848e+14
Variance: 7.714253157e+15
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                  Count   Frequency (%)
37623498               1317    0.2%
983498                 1071    0.1%
192673498              1031    0.1%
154543498              999     0.1%
88773498               967     0.1%
146723498              942     0.1%
105253498              935     0.1%
18603498               922     0.1%
30633498               918     0.1%
29593498               882     0.1%
Other values (13559)   776616  98.7%

Minimum 5 values

Value                  Count   Frequency (%)
73498                  120     < 0.1%
123498                 37      < 0.1%
153498                 193     < 0.1%
173498                 181     < 0.1%
193498                 84      < 0.1%

Maximum 5 values

Value                  Count   Frequency (%)
340453498              1       < 0.1%
340093498              2       < 0.1%
340033498              1       < 0.1%
339983498              2       < 0.1%
339913498              1       < 0.1%

city_id
Real number (ℝ≥0)

Distinct: 3749
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47179.7505
Minimum: 230
Maximum: 100205
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 230
5-th percentile: 10346
Q1: 24799
Median: 46467
Q3: 67886
95-th percentile: 89749
Maximum: 100205
Range: 99975
Interquartile range (IQR): 43087

Descriptive statistics

Standard deviation: 25904.63056
Coefficient of variation (CV): 0.5490624747
Kurtosis: -1.018564164
Mean: 47179.7505
Median Absolute Deviation (MAD): 21419
Skewness: 0.05185593619
Sum: 3.711159174e+10
Variance: 671049884.7
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                  Count   Frequency (%)
10346                  86654   11.0%
20326                  36210   4.6%
80562                  34100   4.3%
50898                  21627   2.7%
40441                  16732   2.1%
60537                  14760   1.9%
44366                  14119   1.8%
45358                  11246   1.4%
4334                   11106   1.4%
90633                  10449   1.3%
Other values (3739)    529597  67.3%

Minimum 5 values

Value                  Count   Frequency (%)
230                    993     0.1%
1298                   6519    0.8%
1676                   77      < 0.1%
1685                   33      < 0.1%
1689                   18      < 0.1%

Maximum 5 values

Value                  Count   Frequency (%)
100205                 1       < 0.1%
100079                 1       < 0.1%
100061                 3       < 0.1%
100048                 56      < 0.1%
99999                  5       < 0.1%

payment_id
Real number (ℝ≥0)

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1668.509077
Minimum: 1491
Maximum: 1811
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 1491
5-th percentile: 1523
Q1: 1619
Median: 1619
Q3: 1779
95-th percentile: 1779
Maximum: 1811
Range: 320
Interquartile range (IQR): 160

Descriptive statistics

Standard deviation: 87.19266546
Coefficient of variation (CV): 0.05225783105
Kurtosis: -1.011622604
Mean: 1668.509077
Median Absolute Deviation (MAD): 0
Skewness: 0.2658271582
Sum: 1312449240
Variance: 7602.56091
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=5)]

Common values

Value                  Count   Frequency (%)
1619                   476600  60.6%
1779                   234133  29.8%
1491                   36497   4.6%
1811                   34492   4.4%
1523                   4878    0.6%

Minimum 5 values

Value                  Count   Frequency (%)
1491                   36497   4.6%
1523                   4878    0.6%
1619                   476600  60.6%
1779                   234133  29.8%
1811                   34492   4.4%

Maximum 5 values

Value                  Count   Frequency (%)
1811                   34492   4.4%
1779                   234133  29.8%
1619                   476600  60.6%
1523                   4878    0.6%
1491                   36497   4.6%

platform_id
Real number (ℝ≥0)

SKEWED

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29868.52938
Minimum: 525
Maximum: 30423
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 525
5-th percentile: 29463
Q1: 29463
Median: 29815
Q3: 30231
95-th percentile: 30359
Maximum: 30423
Range: 29898
Interquartile range (IQR): 768

Descriptive statistics

Standard deviation: 1160.893265
Coefficient of variation (CV): 0.03886677012
Kurtosis: 565.3036862
Mean: 29868.52938
Median Absolute Deviation (MAD): 352
Skewness: -22.53663783
Sum: 2.349458521e+10
Variance: 1347673.174
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=14)]

Common values

Value                  Count   Frequency (%)
29463                  241523  30.7%
30231                  216726  27.6%
29815                  158972  20.2%
30359                  103653  13.2%
30391                  24434   3.1%
29751                  19321   2.5%
29495                  11151   1.4%
30423                  6819    0.9%
30199                  2079    0.3%
525                    1094    0.1%
Other values (4)       828     0.1%

Minimum 5 values

Value                  Count   Frequency (%)
525                    1094    0.1%
22167                  3       < 0.1%
22263                  232     < 0.1%
22295                  1       < 0.1%
29463                  241523  30.7%

Maximum 5 values

Value                  Count   Frequency (%)
30423                  6819    0.9%
30391                  24434   3.1%
30359                  103653  13.2%
30231                  216726  27.6%
30199                  2079    0.3%

transmission_id
Real number (ℝ≥0)

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4253.246112
Minimum: 212
Maximum: 21124
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 212
5-th percentile: 4228
Q1: 4228
Median: 4324
Q3: 4356
95-th percentile: 4356
Maximum: 21124
Range: 20912
Interquartile range (IQR): 128

Descriptive statistics

Standard deviation: 572.8556657
Coefficient of variation (CV): 0.1346866959
Kurtosis: 176.6261099
Mean: 4253.246112
Median Absolute Deviation (MAD): 32
Skewness: -0.9114324558
Sum: 3345603392
Variance: 328163.6137
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=10)]

Common values

Value                  Count   Frequency (%)
4356                   341734  43.4%
4324                   203668  25.9%
4228                   201617  25.6%
4260                   14538   1.8%
212                    12676   1.6%
4996                   6737    0.9%
4196                   5276    0.7%
1988                   207     < 0.1%
21124                  146     < 0.1%
2020                   1       < 0.1%

Minimum 5 values

Value                  Count   Frequency (%)
212                    12676   1.6%
1988                   207     < 0.1%
2020                   1       < 0.1%
4196                   5276    0.7%
4228                   201617  25.6%

Maximum 5 values

Value                  Count   Frequency (%)
21124                  146     < 0.1%
4996                   6737    0.9%
4356                   341734  43.4%
4324                   203668  25.9%
4260                   14538   1.8%

is_returning_customer
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 MiB

Value                  Count   Frequency (%)
1                      408889  52.0%
0                      377711  48.0%

Interactions

[Figures: 100 interaction plots, one per pairing of the 10 numeric variables]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
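
As a minimal sketch of that calculation (plain Python, not the code the report itself runs), `pearson_r` below divides the summed cross-deviations by the product of the root summed squared deviations; the common 1/n factors cancel. The function name and the sample data are illustrative only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of covariance and standard deviations cancel out.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / (dev_x * dev_y)


xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]                    # exactly linear in xs
shifted = [10 * y + 7 for y in ys]   # change of location and scale

print(pearson_r(xs, ys))                    # approximately 1
print(pearson_r(xs, shifted))               # same value: invariant to shift and scale
print(pearson_r(xs, [-y for y in ys]))      # approximately -1
```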

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
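
A minimal sketch of this recipe, assuming ties receive the average of the ranks they span (a common convention, not confirmed by the report): rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. The helper names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def correlation(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sqrt(sum((x - mx) ** 2 for x in xs)) *
                  sqrt(sum((y - my) ** 2 for y in ys)))


def spearman_rho(xs, ys):
    """Correlation of the rank variables of xs and ys."""
    return correlation(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
cubes = [x ** 3 for x in xs]        # monotonic but nonlinear in xs

print(spearman_rho(xs, cubes))      # approximately 1: perfectly monotonic
print(correlation(xs, cubes))       # below 1: the relationship is not linear
```

The example illustrates the point made above: a monotonic but nonlinear relationship gives ρ of about 1 while the linear coefficient stays below 1.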

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
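
The pair-counting definition can be sketched directly. This is the simplest variant (often called tau-a), in which tied pairs count as neither concordant nor discordant; many libraries report a tie-corrected variant (tau-b), so values can differ when the data contain ties. The function name and sample data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move the same way
        elif direction < 0:
            discordant += 1      # the variables move in opposite ways
        # direction == 0 is a tie and is counted in neither group
    total_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / total_pairs


xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]   # one swapped pair: 5 concordant, 1 discordant

print(kendall_tau_a(xs, ys))          # (5 - 1) / 6
print(kendall_tau_a(xs, xs))          # 1.0
print(kendall_tau_a(xs, xs[::-1]))    # -1.0
```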

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Figures: 3 missing-value plots]

Sample

First rows

   customer_id   order_date  order_hour  customer_order_rank  is_failed  voucher_amount  delivery_fee  amount_paid  restaurant_id  city_id  payment_id  platform_id  transmission_id  is_returning_customer
0  000097eabfd9  2015-06-20  19          1.0                  0          0.0             0.000         11.46960     5803498        20326    1779        30231        4356             0
1  0000e2c6d9be  2016-01-29  20          1.0                  0          0.0             0.000         9.55800      239303498      76547    1619        30359        4356             0
2  000133bb597f  2017-02-26  19          1.0                  0          0.0             0.493         5.93658      206463498      33833    1619        30359        4324             1
3  00018269939b  2017-02-05  17          1.0                  0          0.0             0.493         9.82350      36613498       99315    1619        30359        4356             0
4  0001a00468a6  2015-08-04  19          1.0                  0          0.0             0.493         5.15070      225853498      16456    1619        29463        4356             0
5  0001d9036b5e  2015-08-29  19          1.0                  0          0.0             0.000         11.94750     193643498      88276    1619        29463        4356             0
6  0001d9036b5e  2017-01-04  17          2.0                  0          0.0             0.000         11.15100     193643498      88276    1619        29463        4356             0
7  0001d9036b5e  2017-01-28  16          3.0                  0          0.0             0.000         9.71730      193643498      88276    1619        30359        4356             0
8  0001e1e04d7d  2015-10-24  19          1.0                  0          0.0             0.000         25.22250     144833498      45358    1619        29463        4356             1
9  0001e1e04d7d  2016-03-24  19          2.0                  0          0.0             0.000         9.29250      95953498       45358    1619        29463        4324             1

Last rows

        customer_id   order_date  order_hour  customer_order_rank  is_failed  voucher_amount  delivery_fee  amount_paid  restaurant_id  city_id  payment_id  platform_id  transmission_id  is_returning_customer
786590  fffcf45e5c69  2016-11-19  12          1.0                  0          0.0             0.0000        12.53160     107463498      39335    1619        29463        4356             0
786591  fffcf45e5c69  2017-02-04  12          2.0                  0          0.0             0.0000        11.57580     107463498      39335    1619        30359        4356             0
786592  fffd696eaedd  2015-09-14  12          1.0                  0          0.0             1.4297        24.13395     95323498       80562    1779        29463        4356             0
786593  fffe9d5a8d41  2016-07-31  21          NaN                  1          0.0             0.0000        8.44290      156133498      10346    1811        29463        212              1
786594  fffe9d5a8d41  2016-09-30  20          1.0                  0          0.0             0.0000        10.72620     983498         10346    1779        29463        4228             1
786595  fffe9d5a8d41  2016-09-30  20          NaN                  1          0.0             0.0000        10.72620     983498         10346    1779        29463        212              1
786596  ffff347c3cfa  2016-08-17  21          1.0                  0          0.0             0.0000        7.59330      52893498       41978    1619        30359        4356             1
786597  ffff347c3cfa  2016-09-15  21          2.0                  0          0.0             0.0000        5.94720      164653498      41978    1619        30359        4356             1
786598  ffff4519b52d  2016-04-02  19          1.0                  0          0.0             0.0000        21.77100     16363498       80562    1491        29751        4228             0
786599  ffffccbfc8a4  2015-05-30  20          1.0                  0          0.0             0.0000        16.46100     150293498      45952    1619        29463        4324             0